Dropoutgrad

Computes the gradient of the Dropout operation.

\[dx_i = dy_i \cdot \text{mask}_i \cdot \text{scale}\]

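Here \(dy_i\) is the upstream gradient from the following layer, \(\text{mask}_i\) is the same mask that was used in the forward pass, and \(\text{scale}\) is the same scaling factor that was used in the forward pass.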
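The formula above can be sketched as a plain C loop for checking results on a host machine. This is only an illustrative reference, not the library's implementation: the name `dropout_grad_ref` is hypothetical, and the sketch assumes the mask stores 1.0 for kept elements and 0.0 for dropped elements, which this page does not state explicitly.

```c
#include <stddef.h>

/* Hypothetical host-side reference (not part of the library).
 * Computes dx[i] = dy[i] * mask[i] * scale for i in [0, length).
 * Assumption: mask[i] is 1.0f for kept elements and 0.0f for dropped ones. */
void dropout_grad_ref(const float *dy, float scale, int length,
                      float *dx, const float *mask)
{
    for (int i = 0; i < length; ++i) {
        dx[i] = dy[i] * mask[i] * scale;
    }
}
```

With `scale = 2.0`, `dy = {1, 2, -3, 4}` and `mask = {1, 0, 1, 0}`, the loop yields `dx = {2, 0, -6, 0}`: dropped positions receive no gradient and kept positions are rescaled.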

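Inputs:

input - Data address of the upstream gradient tensor (dy).
scale - Scaling factor used in the forward pass.
length - Total number of elements in the tensor.
mask - Data address of the mask tensor used in the forward pass.
core_mask - Core mask (shared-memory versions only).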

Outputs:

output - Data address of the output gradient tensor (dx).
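Because `mask` and `scale` must match the forward pass, it may help to see where they could come from. The following is a minimal sketch under stated assumptions, not the library's forward operator: `dropout_scale` and `dropout_forward_ref` are hypothetical names, the mask is assumed to hold 0.0/1.0 values, and `scale = 1 / (1 - p)` is taken from the call examples on this page.

```c
#include <stdlib.h>

/* Hypothetical helper: scale for dropout probability p, with 0 <= p < 1. */
float dropout_scale(float p)
{
    return 1.0f / (1.0f - p);
}

/* Hypothetical forward pass for illustration only.
 * Fills mask with 0.0f (dropped) or 1.0f (kept) and writes
 * y[i] = x[i] * mask[i] * scale. The same mask and scale are what
 * the gradient routine would later receive. */
void dropout_forward_ref(const float *x, float p, int length,
                         float *y, float *mask)
{
    float scale = dropout_scale(p);
    for (int i = 0; i < length; ++i) {
        /* Uniform sample in [0, 1); keep the element when u >= p. */
        double u = (double)rand() / ((double)RAND_MAX + 1.0);
        mask[i] = (u >= (double)p) ? 1.0f : 0.0f;
        y[i] = x[i] * mask[i] * scale;
    }
}
```

The mask buffer written here would be stored until the backward pass and then passed, together with the value returned by `dropout_scale(p)`, as the `mask` and `scale` arguments.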

Supported platforms:

FT78NE, MT7004

Notes

FT78NE supports fp32.
MT7004 supports fp16 and fp32.

Shared-memory version:

void fp_dropout_grad_s(float *input, float scale, int length, float *output, float *mask, int core_mask)

void hp_dropout_grad_s(half *input, half scale, int length, half *output, half *mask, int core_mask)

C usage example:

// FT78NE example
#include <stdio.h>
#include <dropoutgrad.h>
int main(int argc, char* argv[]) {
    float *dy = (float *)0xA0000000;    // Upstream gradient (dy), DDR
    float *dx = (float *)0xB0000000;    // Output gradient (dx)
    float *mask = (float *)0xC0000000;  // Mask from forward pass

    int length = 4096;
    // Assume the forward pass used dropout probability p = 0.2
    float scale = 1.0f / (1.0f - 0.2f); // scale = 1.25
    int core_mask = 0xff;

    fp_dropout_grad_s(dy, scale, length, dx, mask, core_mask);
    return 0;
}

Private-memory version:

void fp_dropout_grad_p(float *input, float scale, int length, float *output, float *mask)

void hp_dropout_grad_p(half *input, half scale, int length, half *output, half *mask)

C usage example:

// FT78NE example
#include <stdio.h>
#include <dropoutgrad.h>
int main(int argc, char* argv[]) {
    float *dy = (float *)0x10000000;    // Upstream gradient (dy), L2
    float *dx = (float *)0x11000000;    // Output gradient (dx)
    float *mask = (float *)0x12000000;  // Mask from forward pass

    int length = 1024;
    // Assume the forward pass used dropout probability p = 0.5
    float scale = 1.0f / (1.0f - 0.5f); // scale = 2.0

    fp_dropout_grad_p(dy, scale, length, dx, mask);
    return 0;
}